Before you start.

This week tutorial will be a bit playful, as you will focus on ploting the distribution of a crypto‐zoological species – the North American Sasquatch, or Bigfoot (Imaginus magnapedum). Supposedly. Sasquatch belongs to a large primate lineage descended from the extinct Asian species Gigantopithicus blacki. Sasquatch is regularly reported in forested lands of western North America and is considered a significant indigenous American and western North American folk legend. However, the existence of this creature has never been verified with a typed specimen.

The main objective will be to generate a spatial representation of the complete range of the species. This type of analysis is the starting point to build a statistical model to describe the ” environmental requirements” of a species. These models are often referred to as “species distribution modelling” (SDM) or “ecological niche modelling” (ENM). Regardless of the “tongue-in-cheek” nature of the organisms evaluated, this work will provide you with an opportunity to combine different spatial data in R.

Lozier et al. (2009) undertook a similar task. They build an SDM to show the need for careful evaluation of database records before their use in modelling, especially when the presence of cryptic species (morphologically indistinguishable biological groups that are incapable of interbreeding) is suspected, or many records are based on indirect evidence.

Generating and visualising Spatial* objects to show Bigfoot sightings.

Importing occurrence a data.frame into R is easy. Transforming it into a Spatial* object is also easy. However, collecting, georeferencing, and cross-checking coordinate data is a tedious task. Discussions about species distribution modelling often focus on comparing modelling methods. Still, if you are dealing with species with few and uncertain records, your focus should be on improving the quality of the occurrence data. All modelling methods do better if your occurrence data is unbiased, free of errors, and if you have a relatively large number of records.

In this case, you will have a file (usually a .txt or .csv file) with point locality data. Such datasets include the known distribution of a species and hopefully some extra metadata. Such a file can be stored locally in your drive or an on-line repository (e.g., GBIF, GITHUB, or hub.arcgis.com).

Loading and checking the data.

At this point, you should be able to load spatial data stored locally and from on-line repositories. As a reminder, in R, loading online-data is easy if you have the Uniform Resource Locator (URL) determining the location of the data set, which can be used directly as an argument into the functions read.table() or read.csv().

Your task:

Run the code below to see how you can load data from an online repository

# The URL is data provided by the Bigfoot Field Research Organization and stored in hub.arcgis.com 
Bigfoot.URL <- "https://opendata.arcgis.com/datasets/9947fc49e6c44120b4a1b3133c073dbc_0.csv"

# Load the CSV file using the read.csv() function,
Bigfoot.dta <- read.csv(Bigfoot.URL)

#Print the first six rows of Bigfoot.dta
head(Bigfoot.dta)
##           X        Y OBJECTID_1 FID_ FID_1        name
## 1 -121.8053 48.64056          1    0   891  1994-1997+
## 2 -121.0922 46.98333          2    1   892      Sep-83
## 3 -117.6767 46.33333          3    2   893      Jul-00
## 4 -122.1914 46.04722          4    3   894      Jul-85
## 5 -123.1967 47.56160          5    4   895      Aug-92
## 6 -122.5386 47.00440          6    5   896      Aug-00
##                                                                                                          descriptio
## 1                                      Report 60: Missing Cattle and large footprints found\nClass B; Skagit County
## 2                         Report 77: Couple hear vocalizations while camping at Milk Lake\nClass B; Kittitas County
## 3 Report 89: Fishermen hear large animal crashing through the brush in the Blue Mountains\nClass B; Columbia County
## 4      Report 270: Campers awakened by late-night visitor in the Indian Heaven Wilderness\nClass B; Skamania County
## 5                          Report 283: Couple hear sounds and find footprints near Hoodsport\nClass B; Mason County
## 6         Report 337: Man working his land hears rock against wood sound and distant answer\nClass B; Pierce County
##               timestamp_ ObjectId       Lon      Lat STATE_NAME STATE_FIPS Year
## 1 1994/05/13 00:00:00+00      892 -121.8053 48.64056 Washington         53 1994
## 2 1983/09/01 00:00:00+00      893 -121.0922 46.98333 Washington         53 1983
## 3 2000/07/24 00:00:00+00      894 -117.6767 46.33333 Washington         53 2000
## 4 1985/07/01 00:00:00+00      895 -122.1914 46.04722 Washington         53 1985
## 5 1992/08/29 00:00:00+00      896 -123.1967 47.56160 Washington         53 1992
## 6 2000/08/16 00:00:00+00      897 -122.5386 47.00440 Washington         53 2000
##   ID sourceCountry ENRICH_FID   areaType bufferUnits bufferUnitsAlias
## 1  0            US          1 RingBuffer   esriMiles            Miles
## 2  1            US          2 RingBuffer   esriMiles            Miles
## 3  2            US          3 RingBuffer   esriMiles            Miles
## 4  3            US          4 RingBuffer   esriMiles            Miles
## 5  4            US          5 RingBuffer   esriMiles            Miles
## 6  5            US          6 RingBuffer   esriMiles            Miles
##   bufferRadii                  aggregationMethod HasData FedLandPt CritHabPt
## 1           1 BlockApportionment:Landscape.huc12       1      72.1      51.5
## 2           1 BlockApportionment:Landscape.huc12       1      91.2      19.8
## 3           1 BlockApportionment:Landscape.huc12       1      36.8       1.0
## 4           1 BlockApportionment:Landscape.huc12       1      21.3       7.7
## 5           1 BlockApportionment:Landscape.huc12       1      91.0      76.7
## 6           1 BlockApportionment:Landscape.huc12       1      39.9      40.1
##   NLCDsnIcPt NLCDfrstPt NLCDssgPt NLCDAgPt NLCDWetPt NLCDDevPt
## 1        5.3       72.5      17.3      0.0       0.9       1.7
## 2        0.5       81.0      15.1      0.1       0.4       2.8
## 3        0.1       23.5      54.7     17.9       1.9       1.9
## 4        1.0       68.2      14.6      0.0       0.6       4.5
## 5        3.6       75.0      18.7      0.0       0.3       2.1
## 6        0.1       44.2      17.7     10.6       6.5      20.7

First plot.

As you see from the printout on the last page (and below), there are many variables in this data set.

##           X        Y OBJECTID_1 FID_ FID_1        name
## 1 -121.8053 48.64056          1    0   891  1994-1997+
## 2 -121.0922 46.98333          2    1   892      Sep-83
## 3 -117.6767 46.33333          3    2   893      Jul-00
## 4 -122.1914 46.04722          4    3   894      Jul-85
## 5 -123.1967 47.56160          5    4   895      Aug-92
## 6 -122.5386 47.00440          6    5   896      Aug-00
##                                                                                                          descriptio
## 1                                      Report 60: Missing Cattle and large footprints found\nClass B; Skagit County
## 2                         Report 77: Couple hear vocalizations while camping at Milk Lake\nClass B; Kittitas County
## 3 Report 89: Fishermen hear large animal crashing through the brush in the Blue Mountains\nClass B; Columbia County
## 4      Report 270: Campers awakened by late-night visitor in the Indian Heaven Wilderness\nClass B; Skamania County
## 5                          Report 283: Couple hear sounds and find footprints near Hoodsport\nClass B; Mason County
## 6         Report 337: Man working his land hears rock against wood sound and distant answer\nClass B; Pierce County
##               timestamp_ ObjectId       Lon      Lat STATE_NAME STATE_FIPS Year
## 1 1994/05/13 00:00:00+00      892 -121.8053 48.64056 Washington         53 1994
## 2 1983/09/01 00:00:00+00      893 -121.0922 46.98333 Washington         53 1983
## 3 2000/07/24 00:00:00+00      894 -117.6767 46.33333 Washington         53 2000
## 4 1985/07/01 00:00:00+00      895 -122.1914 46.04722 Washington         53 1985
## 5 1992/08/29 00:00:00+00      896 -123.1967 47.56160 Washington         53 1992
## 6 2000/08/16 00:00:00+00      897 -122.5386 47.00440 Washington         53 2000
##   ID sourceCountry ENRICH_FID   areaType bufferUnits bufferUnitsAlias
## 1  0            US          1 RingBuffer   esriMiles            Miles
## 2  1            US          2 RingBuffer   esriMiles            Miles
## 3  2            US          3 RingBuffer   esriMiles            Miles
## 4  3            US          4 RingBuffer   esriMiles            Miles
## 5  4            US          5 RingBuffer   esriMiles            Miles
## 6  5            US          6 RingBuffer   esriMiles            Miles
##   bufferRadii                  aggregationMethod HasData FedLandPt CritHabPt
## 1           1 BlockApportionment:Landscape.huc12       1      72.1      51.5
## 2           1 BlockApportionment:Landscape.huc12       1      91.2      19.8
## 3           1 BlockApportionment:Landscape.huc12       1      36.8       1.0
## 4           1 BlockApportionment:Landscape.huc12       1      21.3       7.7
## 5           1 BlockApportionment:Landscape.huc12       1      91.0      76.7
## 6           1 BlockApportionment:Landscape.huc12       1      39.9      40.1
##   NLCDsnIcPt NLCDfrstPt NLCDssgPt NLCDAgPt NLCDWetPt NLCDDevPt
## 1        5.3       72.5      17.3      0.0       0.9       1.7
## 2        0.5       81.0      15.1      0.1       0.4       2.8
## 3        0.1       23.5      54.7     17.9       1.9       1.9
## 4        1.0       68.2      14.6      0.0       0.6       4.5
## 5        3.6       75.0      18.7      0.0       0.3       2.1
## 6        0.1       44.2      17.7     10.6       6.5      20.7

For now, you will focus on the positional information stored in the columns X (or Lon) and Y (or Lat). The first step to assess what this information is showing is to display it using a scatter plot.

Your task: Using the data stored in the object Bigfoot.dta in the memory you will: Plot a scatter-plot showing the location (i.e., the Latitude [Y or Lat] and Longitude [X or Lon]) of each Bigfoot sighting.

# Plot the spatial set-up of the observations stored in Bigfoot.dta as red-filled circles.
plot(Bigfoot.dta [, c("Lon","Lat")], # Specify the object with the position information
     pch = 19, # Define the type of point to plot.
     col = "red" # Define colour of the points.
     )

Maping the world.

This first visualisation provides a reasonable frame of reference, but it would be better to include the shape of continental North America for reference purposes. Many such pre-loaded maps are part of R packages (e.g.maptools, maps, rworldmap, ggplot2).

As a start, you will use the wrld_simpl data set part of the maptools package.

Your task:

Explore and Plot the data in the wrld_simpl object (part of the maptools package).

# use the function summary() to explore the information stored in the object
summary(wrld_simpl)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
##    min       max
## x -180 180.00000
## y  -90  83.57027
## Is projected: FALSE 
## proj4string :
## [+proj=longlat +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +no_defs]
## Data attributes:
##       FIPS          ISO2          ISO3           UN                    NAME    
##         :  3   AD     :  1   ABW    :  1   Min.   :  4.0   Aaland Islands:  1  
##  AC     :  1   AE     :  1   AFG    :  1   1st Qu.:215.0   Afghanistan   :  1  
##  AE     :  1   AF     :  1   AGO    :  1   Median :429.0   Albania       :  1  
##  AF     :  1   AG     :  1   AIA    :  1   Mean   :431.8   Algeria       :  1  
##  AG     :  1   AI     :  1   ALA    :  1   3rd Qu.:650.5   American Samoa:  1  
##  AJ     :  1   AL     :  1   ALB    :  1   Max.   :894.0   Andorra       :  1  
##  (Other):238   (Other):240   (Other):240                   (Other)       :240  
##       AREA              POP2005              REGION         SUBREGION     
##  Min.   :      0.0   Min.   :0.000e+00   Min.   :  0.00   Min.   :  0.00  
##  1st Qu.:     44.5   1st Qu.:1.275e+05   1st Qu.:  2.00   1st Qu.: 14.00  
##  Median :   5515.5   Median :3.193e+06   Median : 19.00   Median : 30.00  
##  Mean   :  52696.1   Mean   :2.464e+07   Mean   : 65.43   Mean   : 54.84  
##  3rd Qu.:  34708.8   3rd Qu.:1.240e+07   3rd Qu.:142.00   3rd Qu.: 61.00  
##  Max.   :1638094.0   Max.   :1.313e+09   Max.   :150.00   Max.   :155.00  
##                                                                           
##       LON               LAT          
##  Min.   :-178.13   Min.   :-80.4460  
##  1st Qu.: -50.16   1st Qu.: -0.3025  
##  Median :  17.66   Median : 16.5110  
##  Mean   :  13.28   Mean   : 16.4289  
##  3rd Qu.:  50.01   3rd Qu.: 39.1067  
##  Max.   : 179.22   Max.   : 78.8300  
## 
# Print the variable names using names()
names(wrld_simpl)
##  [1] "FIPS"      "ISO2"      "ISO3"      "UN"        "NAME"      "AREA"     
##  [7] "POP2005"   "REGION"    "SUBREGION" "LON"       "LAT"
# Look at the first five rows of the Data matrix stored in wrld_simpl using head()
head(wrld_simpl@data)
##     FIPS ISO2 ISO3 UN                NAME   AREA  POP2005 REGION SUBREGION
## ATG   AC   AG  ATG 28 Antigua and Barbuda     44    83039     19        29
## DZA   AG   DZ  DZA 12             Algeria 238174 32854159      2        15
## AZE   AJ   AZ  AZE 31          Azerbaijan   8260  8352021    142       145
## ALB   AL   AL  ALB  8             Albania   2740  3153731    150        39
## ARM   AM   AM  ARM 51             Armenia   2820  3017661    142       145
## AGO   AO   AO  AGO 24              Angola 124670 16095214      2        17
##         LON     LAT
## ATG -61.783  17.078
## DZA   2.632  28.163
## AZE  47.395  40.430
## ALB  20.068  41.143
## ARM  44.563  40.534
## AGO  17.544 -12.296
# Check what type of object is wrld_simpl using class()
class(wrld_simpl)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
# plot wrld_simpl using plot
plot(wrld_simpl)

Extracting North America.

The object named wrld_simpl is a SpatialPolygonsDataFrame with multiple attributes associated to each of the Poligons bounded together - this is the information you summarised with function summary().

In this case, you are not interested in the entire globe, but rather only the three “North American” countries: Canada, United States, and Mexico. One way to generate a new SpatialPolygonsDataFrame only for North American countries is Sub setting wrld_simpl using the NAMES variable as you would do in a data.frame.

Your task:

Subset wrld_simpl so the new SpatialPolygonsDataFrame only has the data and spatial presentation for Canada, United States, and Mexico.

# Define a TRUE/FALSE vector that assesses if each name in wrld_simpl NAME variables is either "Canada", "United States"or "Mexico".
is.NoAmCount <- wrld_simpl@data[, "NAME"]%in%c("Canada", "United States", "Mexico")

# Sub setting wrld_simpl using the is.NoAmCount vector.
NoAmPoly <- wrld_simpl[is.NoAmCount, ]
# Show the current names in the "reduced" SpatialPolygonsDataFrame.
NoAmPoly$NAME
## [1] Canada        Mexico        United States
## 246 Levels: Aaland Islands Afghanistan Albania Algeria ... Zimbabwe

The summary of NoAmPoly shows multiple features, one per country. When NoAmPoly is plotted over Bigfoot.dta it provides spatial context to Bigfoot sightings.

Your task:

Plot the location (i.e., the Latitude [Y or Lat] and Longitude [X or Lon]) of each Bigfoot sighting OVER the reference map with only North American countries.

# Plot Bigfoot sightings as red-filled circles.
plot(Bigfoot.dta[, c("Lon", "Lat")] , # Define the object to plot.
     pch = 19,  # Define the marks are filled circles.
     col = "red" # Define the colour of the marks
     )
# add the Reference map - only North American countries.
plot(NoAmPoly , # Define the SpatialPolygonsDataFrame to plot. 
   add = T # Should the Geometry be added to the current plot?
   )

Geographic transformations for polygons.

Like many other maps, the maps you have plotted so far represent a distorted view of the space. Why is that?, because the Mercator projection (the one used most often when the information is measured in Latitude and longitude) make areas near to the Earth’s polar regions (Greenland, Alaska and Antarctica, for example) look much larger than they are relative to areas nearer to the equator. This is because Mercator’s way of flattening the globe involves stretching the far northern and far southern parts of the world out until a flat, rectangular map is achieved.

The question is how to pick the “right” way to represent the area of interest. There is no clear-cut answer to this, as it depends on the mapping exercise objective. For example, do you want to focus on a small area or an entire continent?. Lucky for you, mapping agencies everywhere have often already flagged a reasonable projection for whatever region/country you are interested in, so your first step is to look for that with a quick Internet search.

A version of Albers Equal Area projection is optimised for North America’s characteristics (wider east to west than north to south extents) for the study region. If South America were the focal region, you would need a different projection optimised to represent an area larger north to south than broader east to west.

With this information, you can look for the PROJ string in either the PRøj https://proj.org/ or Spatial Reference https://spatialreference.org/ websites and re-project your North America map using the spTransform() function from the sp package.

Your task:

You will now use the spTransform() function from the sp package to re-project the wrld_simpl the world map to an Albers Equal Area projection for North America.

Once re-projected, plot the resulting SpatialPolygonsDataFrame to assess how the visualisation of wrld_simpl changed.

# Define the modified Albers Equal Area projection for North America <//epsg.io/42303.proj4>
CRS.String <- "+proj=aea +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_def"

# Using the spTransform function to change the projection.
wrld.AEA <- spTransform(x = wrld_simpl, # The object to "re-project".
                         CRSobj = CRS(CRS.String) # the new projection as a CRS object.
                             )
# Plot the re-projected map.
plot(wrld.AEA)

Projecting North America.

You see now that the focal area of the map is now North America (it is at the centre of the figure), with other areas distorted to the point of not being recognisable in many cases (i.e., Russia or China).

Now, to accurately represent the region of interest, you can subset wrld.AEA to only the North American Countries (Canada, United States, and Mexico).

Your task:

Generate a project map for North America by sub setting wrld.AEA to only have the information for Canada, United States, and Mexico.

after this, plot your re-projected North American Map.

# Define a TRUE/FALSE vector that assesses if each name in wrld_simpl NAME variables is either "Canada", "United States"or "Mexico"
is.NoAmCount <- wrld.AEA@data[, "NAME"] %in% c("Canada", "United States", "Mexico")

# Sub setting wrld.AEA using the is.NoAmCount vector.
NoAmPoly.AEA <- wrld.AEA[is.NoAmCount, ]

# Plot the re-projected map.
plot(NoAmPoly.AEA)

Alternatively, you can re-project the SpatialPolygonsDataFrame with only North American Countries (NoAmPoly). An example of how this can be done is below.

# Approach two: Re-project NoAmPoly
# Using the spTransform function to change the projection.
NoAmPoly.AEA <- spTransform(x = NoAmPoly , # The object to "re-project".
                            CRSobj = CRS(CRS.String) # the new projection as a CRS object.
                            )
# Plot the re-projected map.
plot(NoAmPoly.AEA)

Projecting Bigfoot data - create a spatial object.

The base map is now perfect to represent the area of interest so that distances and areas are accurately represented. But what about the observations? These are still in Latitude and Longitude (only georeferenced). Also, it is still a data.frame and not a Spatial* object.

Your task:

Using the function coordiates() create a SpatialPointDataFrame based on the information in Bigfoot.dta.

Here you will use two variables (Lon and Lat) in Bigfoot.dta to define the spatial information.

Once you are done creating this SpatialPointDataFrame, print its’ summary.

# Copy the data.frame to another object so the original information is available after the object transformation
Bigfoot.SpaPnt <- Bigfoot.dta

# Use the coordinates function to transform Bigfoot.SpaPnt from a data.frame to a SpatialPointDataFrame 
coordinates(Bigfoot.SpaPnt) <- ~ Lon + Lat # note that the transformation to a SpatialPointDataFrame is done by defining a formula were the coordinates are the predictors.

# What is the class of the object after using coordinates?
class(Bigfoot.SpaPnt)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
# Plot the SpatialPointDataFrame..
plot(Bigfoot.SpaPnt,
     pch = 19,  # Define the marks are filled circles.
     col = "red" # Define the colour of the marks
     )

You could also do the operation above by creating a SpatialPoints object and merging a data.frame to it, or using the SpatialPointDataFrame() function. The code below shows how this is done.

Bigfoot.Pnt <- SpatialPointsDataFrame(coords = Bigfoot.dta[, c("Lon", "Lat")] ,# Define the coordinates for each observation
                      proj4string = CRS("+proj=longlat +datum=WGS84"), # Define the projection - It should be WGS84 as you have Lat/long Info.
                      data = Bigfoot.dta # Define the data.frame to be appended.
                      )
# Plot the SpatialPointDataFrame..
plot(Bigfoot.Pnt,
     pch = 19,  # Define the marks are filled circles.
     col = "red" # Define the colour of the marks
     )

Projecting Bigfoot data - Projectiong spatial points

Once you have this SpatialPointDataFrame, plotting it would produce a very similar map to the one created with Bigfoot.dta, but without the coordinates and “stretched” east-to-west. This is what you need to

Your goal now is to re-project the sightings points to the “best” projection for the region of interest. Like before, this is using the spTransform() function from the sp package. But before you do this, check the Bigfoot.SpaPnt projection using the proj4string() function. For this just run the code below and ignore the warning.

#check the projection of `Bigfoot.SpaPnt` 
proj4string(Bigfoot.SpaPnt)
## [1] NA

Now that you know the map is in geographical coordinates, lets project it to an Albers Equal Area projection. One thing before we move on, if there was no coordinate system in the object (the case if the object is created using coordinates()) you would need to add a initial projection manually.

Your task:

Use the function spTransform() to re-project Bigfoot.SpaPnt to an Albers Equal Area projection.

A better map.

With that last step, you are ready to generate a nicer looking map of the region of interest and the recorded Bigfoot sightings.

Your task:

Run the code below to plot your projected Bigfoot sightings object (Bigfoot.SpaPnt) OVER your projected map of North American counties (NoAmPoly.AEA).

# Define the plotting space [see ?par so check what xaxs = "i" and yaxs = "i" do]
par(mar = c(2, 2, 6, 2),
    xaxs = "i", yaxs = "i")

# Plot the Area of interest making the continents light-grey, the boundaries black, and the oceans light-blue.
plot(NoAmPoly.AEA , # Define the SpatialPolygonsDataFrame to plot - projected North America
     col = 'lightgrey', # make the continental areas light-grey
     bg = 'lightblue', # Make the oceans blue
     xlim = range(coordinates(Bigfoot.SpaPnt.AEA)[, 1]), # Set limits to zoom into the region with points
     ylim = range(coordinates(Bigfoot.SpaPnt.AEA)[, 2]), # Set limits to zoom into the region with points
     main = "Bigfoot sigthings\n[1869 to 2017]" ,
   cex.main = 1.2)

# Add a bounding box, some tick marks to define the latitude-longitude point
box(); axis(1, labels = NA); axis(2, labels = NA); axis(3, labels = NA); axis(4, labels = NA)
# Note that the commands above are separated with a semicolon (;) as this allows to put consecutive functions in one line.

# Plot Bigfoot sightings as a red-filled points
points(Bigfoot.SpaPnt.AEA, # Define the SpatialPointsDataFrame to plot - Projected Big foot obs
       pch = 19, # Define the mark shape.
       col = "red", # Define the mark colour.
       cex=0.7 # Define the size of the dots.
       )

This map is clearly better than that first raw visualization!!!

Advanced maps.

One of the great things about R is that you can make more than one type of visualization, particularity is you are making an interactive document.

Below is an example of such interactive map dome with the mapview package wich builds on leaflet.

require(mapview)
mapview(Bigfoot.SpaPnt, xcol = "Longitude", ycol = "Latitude", grid = T,cex= 2,col.regions="red")

Loading and processing Raster* objects.

Now that you know where Bigfoot has been “sighted” across North America is time to get environmental information to define the conditions that determine the best habitats for Bigfoot. This is the objective of species distribution modelling (SDM). In SDMs, environmental information is usually extracted from raster files stored in some kind of GIS format. Variables can include climatic, soil, terrain, vegetation, land use, and other variables.

Get climate data.

As a start, you will focus on climatic data for the region of interest. There are many sources of climatic information freely available to be used. Some examples are the Climatic Research Unit (CRU http://www.cru.uea.ac.uk/), Climatologies at high resolution for the Earth’s land surface areas (CHELSA https://chelsa-climate.org/), WorldClim v2. (https://www.worldclim.org/), ecoClimate (https://www.ecoclimate.org/), global climatologies for bioclimatic modelling (CliMond https://www.climond.org/) and climate re-analyses such as ERA5-Land (https://bit.ly/39kQlOL).

For simplicity, you will focus now on WorldClim climatologies and obtain the 19-bioclimatic variables commonly used in SDMs. These can be downloaded and stored locally. Some of these can be loaded directly from R with the use of packages. For example, WorldClim data can be obtained directly from R using the getData() function of the raster package. Climate data from CRU can be obtained directly from R using the create_CRU_stack() function of the getCRUCLdata package. For an example of how this can be done see the code below.

# Now use the function getData() to download the 19-BioClimatic part of WorldClim at a 10ArcMis resolution.
WC.Bioclim <- getData(name = 'worldclim', # Define the Data set name.
                      res = 10, # This defines the resolution of the Imported raster (10ArcMis)
                      var = 'bio') # This defines that BioClimatic variables are imported
# Look at the names of the variables in WC.Bioclim.
names(WC.Bioclim)

Visualize the climate data.

By using getData() you get a number of files that are downloaded into a folder named wc10 in the working working directory. These are ESRI .hdr Labelled files named bio1 to bio19.

You will not do this, as I have downloaded it as is available in your session. The object name is WC.Bioclim as it is a RasterStack with 19 layers, coded Bioclim.1 to Bioclim.19 (see https://www.WorldClim.org/data/bioclim.html for a translation of the codes). But, how do they look? The only way to see that is to plot the variables.

Your task:

Plot the Bioclim.1 (Annual Mean Temperature [C]) and Bioclim.12 (Total Annual Precipitation [mm/yr]) layers of WC.Bioclim.

The maps should be shown as two panels in the same figure.

# Define a plotting space with One-Row and Two-columns
par(mfrow = c(1, 2))
# plot Annual Mean Temperature (WorldClim temp are C*10 so you need to do some operations before plotting).
# Define a divergent blue-to-grey-to-red HCL palette with 50 values.
plot(WC.Bioclim[['Bioclim.1']]/10, # Define the Layer from WC.Bioclim to plot. I divide the value by 10 as WorldClim temp are C*10 (to avoid using Floating Numbers)
     main = "Annual Mean Temperature\n[C]", # Set the Figure title
     cex.main = 1.2, # Define the Title font size
     col = hcl.colors(n = 50 , palette = "Blue-Red") # Define a divergent HCL palette with 50 values.
     )
# plot Total Annual Precipitation 
# Define a divergent red-to-grey-to-blue HCL palette with 50 values.
plot(WC.Bioclim[['Bioclim.12']], # Define the Layer from WC.Bioclim to plot.
     main = "Total Annual Precipitation\n[mm/yr]", # Set the Figure title
     cex.main = 1.2,# Define the Title font size
     col = rev(hcl.colors(n = 50 ,palette = "Blue-Red")) # Define a divergent HCL palette with 50 values.
     )

Map algebra - 1

For clarity, “bioclimatic” summaries represent annual trends (e.g., mean annual temperature, annual precipitation), seasonality (e.g., annual range in temperature and precipitation) and extreme or limiting environmental factors (e.g., the temperature of the coldest and warmest month, and precipitation of the wet and dry quarters).

The best way to know what information is summarised in these variables is to build them yourself, so this is your next task. Specifically, you will now create Bioclim-7 [Temperature Annual Range] as an example.

For this, you will need the monthly maximum and minimum temperatures, which can be downloaded using the var = 'tmin' and var = 'tmax' arguments on the getData() function of the raster package. This data is now loaded in the memory.

Both WC.Tmin and WC.Tmax are RasterStacks objects with 12 bands (one for each one of the months). The first step to build a raster with the Temperature Annual Range (Bioclim-7) is to estimate the Maximum Temperature of Warmest Month (Bioclim-5) and the Minimum Temperature of Coldest Month (Bioclim-6).

These can be estimated directly on a the RasterStack using the max() and min() functions. Alternatively, you can use the function calc() from the raster package. This function allows for more elaborate calculations as you can define more complex functions with the data within a single cell in a multi-band raster (see ?calc for more info on the function).

Your task:

Using the monthly minimum (WC.Tmin) estimate the minimum temperature of the coldest month (the minimum value in WC.Tmin) and plot it. Remmebr that temperatures here are C x 10.

#Estimate the minimum temperature of coldest month 
WC.MinTCoMo <- min(WC.Tmin)

#plot the minimum temperature of coldest month 
plot(WC.MinTCoMo/10)

Your task:

Using the monthly maximum (WC.Tmax) estimate the maximum temperature of the warmest month (the maximum value in WC.Tmax) and plot it.

# Estimate the maximum temperature of warmest month
WC.MaxTWaMo <- max(WC.Tmax)

#plot the maximum temperature of coldest month 
plot(WC.MaxTWaMo/10)

Map algebra - 2

Doing the operations above transforms WC.Tmax and WC.Tmin from a RasterStack with values for each month to a RasterLayer that only has the maximum or minimum monthly temperature values (that is, a one layer raster).

With this information, it is now possible to generate an Annual Temperature Range (Bioclim-7) RasterLayer. For this, you will subtract the Minimum Temperature of Coldest Month (Bioclim-6; WC.MinTCoMo) from the Maximum Temperature of the Warmest Month Bioclim-5; WC.MaxTWaMo). This can be done using the RasterLayers directly or the calc() function if you bundle WC.MinTCoMo and WC.MaxTWaMo in one RasterStack.

Your task:

Using your estimates of the maximum temperature of the warmest month (WC.MinTCoMo) and the minimum temperature of the coldest month (WC.MaxTWaMo), estimate the Temperature Annual Range.

Once you have done this, plot your Temperature Annual Range estimation.

#Estimate the minimum temperature of coldest month 
WC.MinTCoMo <- min(WC.Tmin)
# Estimate the maximum temperature of warmest month
WC.MaxTWaMo <- max(WC.Tmax)

Now you know how these bioclimatic variables are generated. But the process above is long and tedious, done only to understand how raster operations work. When you face a similar task in the future, you could use thebioclim() function for the dismo package. I leave that task to the interested student.

Graphical representation of Raster* objects.

You have a set of predictor variables as rasters in the WC.Bioclim object. Your next task is mapping the climatic predictors as done with the occurrences. The first step will be cropping the global climate maps to the region of interest (North America). You could do this using the crop() function of the raster package and defining the cutting region using the Bigfoot.SpaPnt object.

Your task:

Crop the WC.Bioclim raster using the spatial extent of your georeferenced observations (Bigfoot.SpaPnt).

After you have done this, plot the resulting RasterLayer.

# Crop the WC.Bioclim raster using the observations spatial extent
NoAm.WC.Bioclim <- crop(x= WC.Bioclim, # Define the Raster to crop
                        y = Bigfoot.SpaPnt # Define the spatial object that determine the extent to be cropped
                        )
# Plot the cropped Raster
plot(NoAm.WC.Bioclim)

Projecting Raster* objects.

As you see from the plot above, the result of the crop() function is a subset of the 19-BIOCLIM variables for continental North America. One more thing here, the plot function in a RasterStack only plots up to 16-layers. However, like your first observations map, the visualisation does not look very “pretty”. So the first thing to do is re-project these to an Albers Equal Area projection using the projectRaster() function.

Your task:

Using the function projectRaster(), re-project your North America cropped bioclimatic variables raster (NoAm.WC.Bioclim) to an Albers Equal Area projection.

After you have done this, plot the resulting RasterLayer.

# project the NoAm.WC.Bioclim raster
NoAm.WC.Bioclim.AEA <- projectRaster(from = NoAm.WC.Bioclim, # Define the Raster to be projected.
                                     crs = CRS(CRS.String) # Define the projection as a CRS object.
                                     )
# Plot the re-projected Raster
plot(NoAm.WC.Bioclim.AEA)

A better Graphical representation of Raster* objects.

Now you have a series of projected rasters that can adequately show the climatic conditions in the region of interest. You will now focus on Annual Mean Temperature (bio1) as an example of producing an “almost” publication ready map. You can do this in many (and more straightforward) ways, but here you will use a more “laborious” approach, so you know how to use alternative plotting parameters.

Your task:

Generate the best visualisation for Annual Mean Temperature (bio1) using the projected Bioclimatic raster (NoAm.WC.Bioclim.AEA) by taking the following steps:

  • Define a plotting region so that the map is bounded within a region.

  • Plot the raster of Annual Mean Temperature using a Yellow-to-Red colour ramp palette, setting the minimum and maximum values for which colours should be plotted to -20 and 30.

  • Add a bounding box and axes with only tick-marks.

  • Add your projected map (NoAmPoly.AEA) of the region of interest, the projected observation points (Bigfoot.SpaPnt.AEA), and a legend to say what the plotted symbols are.

# Step 1: define a plotting region so that the maps do not get affected by how big is the plotting space
par(mfrow=c(1,1), 
    mar=c(2,2,4,2), # Defines the margins of the plotting space
    fig=c(0,1,0.1,1),# this argument defines a smaller figure space
    new=F) # this argument states that a blank plotting space is used 

# create a new plotting space
plot.new()

# Define the "limits" of the plotting space
plot.window(xlim = extent(Bigfoot.SpaPnt.AEA)[1:2], # get the xlim form the projected Bigfoot sightings
            ylim = extent(Bigfoot.SpaPnt.AEA)[3:4]) # get the ylim form the projected Bigfoot sightings)

# Step 2: plot the raster
# add the raster of Annual Mean Temperature
image(NoAm.WC.Bioclim.AEA[[1]]/10, # Define the Raster Layer to plot. You know why you divide by 10 here
      col = hcl.colors(100, "YlOrRd", rev = TRUE), # Define a colour ramp palette
      zlim = c(-20,30), # define the range of values to plot 
      add=T # Add this figure to the current plot?
      )
# Add a Main title manually
mtext("Annual Mean Temperture [C]", # Define the main title text
      side = 3, # Define the margin to locate the text.
      cex = 1.5, # Define the text size
      font = 2, # Define is the text is bold
      line = 1.5 # Define the lien to add the text
      )

# Step 3: Add area elements
#Add a bounding box

# Add a bounding box and the axes with ticks
box(); axis(1, labels = NA); axis(2, labels = NA); axis(3, labels = NA); axis(4, labels = NA)

# Step 4: Add other elements
# Add a map of the region of interest
# Plot the Area of interest
plot(NoAmPoly.AEA, # Define the SpatialPolygonsDataFrame to be plotted.
     add=T # Add this figure to the current plot?
     )
# Plot Bigfoot sightings
points(Bigfoot.SpaPnt.AEA, # Define the SpatialPointsDataFrame to plot.
       pch = 10, # Define the shape of the mark.
       col = "black" # Define the colour of the mark.
       )

# Add a legend to say what the symbols are - Manually
legend("bottomleft",
       pch=10,
       legend = "Bigfoot\nsightings",
       bty="n")

# Step 4: Add a colour ramp by hand
par(fig=c(0,1,0.01,0.13),# this argument defines a smaller figure space 
  mar=c(2,2,1,2),
  new=T) # this argument states that a black plotting space is used 
# Plot a matrix with all possible values
image((matrix(1:100)),
      axes=F,
      col = hcl.colors(100, "YlOrRd", rev = TRUE))
# Add a bounding box
box()
# Add an axis with the values
axis(1,
     at = seq(0,1, length.out = 6),
     labels = c(-2:3)*10)

# Add a main title.
mtext("Annual Mean Temperture [C]",
      side = 3,
      cex = 1,
      line = 0.3)

par(new=F,
    fig=c(0,1,0,1)) # this argument states that a black plotting space is used 

Extract values from a Raster* object to use as predictors

The next step is to extract the values of the predictors at the locations where Bigfoot has been sighted. That is, get data about the climate that the species apparently likes. This is a very straightforward thing to do using the extract() function from the raster package. The question is which Bigfoot sighting object should you use: Bigfoot.dta, Bigfoot.SpaPnt, or Bigfoot.SpaPnt.AEA. The answer to this is given by the reference system of WC.Bioclim.

Your task:

Using the function projection(), print the projection string of the WC.Bioclim raster.

# Extract the projection
projection(WC.Bioclim)
## [1] "+proj=longlat +datum=WGS84 +no_defs"

As you see, this is a Mercator projection, so you could use Bigfoot.dta or Bigfoot.SpaPnt to extract the values for each of the sites where Bigfoot has been sighted. If you instead used NoAm.WC.Bioclim.AEA as your source raster, you would need to use Bigfoot.SpaPnt.AEA to get the data. With that clear, now you will extract the environmental information.

Your task:

Using the function extract() get the values for the 19 Bioclimatic variables in WC.Bioclim for each Bigfoot sightings. for this use the Bigfoot.SpaPnt spatial object.

# The same as above but using the SpatialPointDataFrame
BFC <- extract(x = WC.Bioclim ,# Define the RasterStack with the Environmental information.
                y = Bigfoot.SpaPnt # Define the object with the locations to get the data
                )
# Print the resulting data.frame head.
head(BFC)
##      Bioclim.1 Bioclim.2   Bioclim.3 Bioclim.4 Bioclim.5 Bioclim.6 Bioclim.7
## [1,]  50.66667  125.8889 -18.7777786 1675.6666 255.88889  47.55556  52.11111
## [2,]  37.66667  119.5556 -35.3333321 1727.0000 285.77777  29.11111  62.44444
## [3,]  66.00000  156.5556 -20.3333340  642.2222  85.66666  18.00000  40.44444
## [4,]  73.22222  142.7778  12.0000000 2503.5557 402.44446  35.55556  62.00000
## [5,]  51.44444  116.4444  -0.3333333 2295.1111 371.77777  46.66667  59.66667
## [6,] 104.44444  168.1111  44.6666679 1236.7778 202.11111  21.55556  62.33333
##      Bioclim.8 Bioclim.9 Bioclim.10 Bioclim.11 Bioclim.12 Bioclim.13 Bioclim.14
## [1,]  742.0000 168.66667  187.77777   679.0000   88.22222   34.44444   5752.222
## [2,]  826.1111 134.44444  143.77777   767.7778   91.77778   33.66667   6148.333
## [3,]  252.0000  73.44444   84.66666   233.3333  131.11111   38.22222   6936.333
## [4,] 1178.8889 176.44444  191.11111  1097.1111  100.66666   40.22222   5209.556
## [5,] 1051.5555 179.22223  200.66667   983.1111   82.44444   38.11111   4709.111
## [6,]  587.3333  91.22222   91.22222   537.8889  103.33334   42.22222   4913.778
##      Bioclim.15 Bioclim.16 Bioclim.17 Bioclim.18 Bioclim.19
## [1,]   195.6667 -57.333332   253.0000 -13.888889   124.8889
## [2,]   193.5556 -74.222221   267.7778 -30.333334   117.3333
## [3,]   266.8889 -71.111115   338.0000 -14.666667   154.2222
## [4,]   222.7778 -25.666666   248.4444  16.444445   141.2222
## [5,]   180.3333 -34.222221   214.5556   3.555556   113.2222
## [6,]   244.8889   3.666667   241.2222  49.222221   168.1111

Checking the Extraction of values from a Raster* object to use as predictors

One important check is whether there are observations with no environmental data (as these could be located in areas where the RasterStack has no data). You can do this using a simple apply loop.

Your task:

Using the function apply(), define the sightings with no bioclimatic information (rows with NA).

# Determine the rows where the variables have NA as a value.
apply(BFC ,
      MARGIN = 2, # apply the function to each column
      function(x){
        which(is.na(x))})# test for each column the rows with no values
## integer(0)

The code above shows that rows 3595 and 3682 do not have environmental data. To see where they are, you can plot the “nice map” again at the end of the previous section modifying the points() function to mark the locations with no environmental data.

Your task:

Plot the “nice map” at the end of the previous section, modifying the points() function to mark the locations with no environmental data as black filled points.

#Plot the are of interest, observations and Latitudes and Longitudes lines
# define the plotting area
par(mar = c(2, 2, 6, 2), xaxs = "i", yaxs = "i")
# Plot the Area of interest
plot(NoAmPoly.AEA, 
     col = 'lightgrey', # make the continual areas light-grey
     bg = 'lightblue', # Make the oceans blue
     xlim = range(coordinates(Bigfoot.SpaPnt.AEA)[, 1]), # Set limits to zoom into the region with points
     ylim = range(coordinates(Bigfoot.SpaPnt.AEA)[, 2]), # Set limits to zoom into the region with points
     main = "Bigfoot sigthings\n[1869 to 2017]", 
     cex.main = 1.2)
box()
# Add some tick marks to define the latitude-longitude point
# Note that these commands are separated with a semicolon (this allows to put consecutive functions in one line)
axis(1, labels = NA); axis(2, labels = NA); axis(3, labels = NA); axis(4, labels = NA)
# Plot Bigfoot sightings
points(Bigfoot.SpaPnt.AEA ,
       pch=19,
       col = "red")
# Show the sites with no environmental data
points(Bigfoot.SpaPnt.AEA[c(3595, 3682), ] ,
       pch = 19, 
       cex=1.5,
       col = "black")

This figure shows that a couple of points in North Carolina do not have environmental data and should be removed from the analysis.

Last Points.

The work you have done today is the type of work you would do in an actual project. You have explored your observations and predictors, then matched these two together, and finally generated a model to test a hypothesis and make some predictions.